Workshop on W3C's Multimodal Architecture and Interfaces: Kinesthetic Input Modalities for the W3C Multimodal Architecture
Authors
Abstract
Deutsche Telekom Laboratories and T-Systems have recently developed various multimodal prototype applications and modules that allow kinesthetic input, i.e. devices that can be moved around in order to alter the application's state. Our first prototypes used proprietary technologies and principles, whereas the latest demonstrators increasingly follow the W3C's Multimodal Architecture. This paper describes how we propose to integrate kinesthetic input into the multimodal framework as a new modality and discusses how this component fits into the W3C Multimodal Architecture.

Introduction

Mobile devices are becoming more and more powerful while at the same time getting smaller and smaller. Most of today's mobile handsets are equipped with cameras, and the first mobile phones with built-in acceleration sensors are now shipping. These developments enable new input paradigms and capabilities that provide intuitive input modes. Deutsche Telekom Laboratories and T-Systems have integrated various motion sensors, based e.g. on acceleration sensors or the analysis of a camera picture, into prototypical applications that could potentially profit from input gestures in which the whole device is moved around. The motion patterns (e.g. tilt up, tilt down, tilt left, tilt right) are derived from the raw sensor or camera output using digital signal processing. The output patterns are mapped to corresponding events and may be used within the application control logic as user input events to manipulate the user interface (a sketch of such a mapping is given below). In our first experiments, tilting up and down proved effective for menu and list navigation.

The general interaction logic follows a "sphere" metaphor, in that the motion patterns can be thought of as lying in the directions indicated on a virtual sphere ("tilt up" means tilting the upper part of the device down along the sphere's surface).
[Figure: the "sphere" metaphor for motion patterns]

This paper describes the technical integration of the motion detection sensor into the multimodal application and discusses the possibility of using it as a modality component within the W3C Multimodal Architecture. In analogy to the VoiceXML use case referred to in the W3C Working Draft on Multimodal Architecture and Interfaces [MMIARCH], we also introduce the cases of local and remote processing and discuss their advantages and disadvantages.

Kinesthetic modality component

The digital sensor (or camera), together with the digital signal processing, can be seen as a modality component within the W3C Multimodal Architecture. It generates events corresponding to user motion
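As an illustration of the pattern-detection step described in the introduction, the following is a minimal sketch of how discrete tilt events could be derived from raw acceleration samples by low-pass filtering the measured gravity vector and thresholding its deviation from a resting orientation. The class name, thresholds and axis sign conventions are assumptions made for this sketch and are not taken from the actual prototypes.

```java
// Hypothetical sketch: mapping raw acceleration samples to discrete tilt
// events. Thresholds and axis sign conventions are illustrative assumptions.
public final class TiltDetector {

    public enum TiltEvent { TILT_UP, TILT_DOWN, TILT_LEFT, TILT_RIGHT, NONE }

    private static final double THRESHOLD = 0.35; // deviation threshold in g
    private static final double ALPHA = 0.2;      // low-pass filter coefficient

    private double filteredX; // smoothed x component of the gravity vector
    private double filteredY; // smoothed y component of the gravity vector

    /** Feed one raw sample (x and y acceleration in g); returns the detected pattern. */
    public TiltEvent onSample(double rawX, double rawY) {
        // Exponential low-pass filter to suppress hand jitter before thresholding.
        filteredX = ALPHA * rawX + (1 - ALPHA) * filteredX;
        filteredY = ALPHA * rawY + (1 - ALPHA) * filteredY;

        if (filteredY > THRESHOLD)  return TiltEvent.TILT_UP;
        if (filteredY < -THRESHOLD) return TiltEvent.TILT_DOWN;
        if (filteredX < -THRESHOLD) return TiltEvent.TILT_LEFT;
        if (filteredX > THRESHOLD)  return TiltEvent.TILT_RIGHT;
        return TiltEvent.NONE;
    }
}
```

In an application, the resulting tilt events would then be translated into the user input events consumed by the control logic, e.g. moving a menu selection one entry up or down.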
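The events produced by the kinesthetic modality component would then be passed to the interaction manager. The following sketch shows how such a notification might be serialized as an MMI ExtensionNotification carrying an EMMA interpretation. The namespace URIs, element and attribute spellings, and the "motion" mode value are assumptions based on the published MMI and EMMA specifications and may not match the exact XML serialization of the working drafts; this is not the event format used in our prototypes.

```java
// Hypothetical sketch: serializing a detected tilt gesture as an MMI
// ExtensionNotification with an EMMA payload. All identifiers (context,
// request ID, source/target names) are illustrative placeholders.
public final class MmiEventWriter {

    public static String tiltNotification(String context, String requestId, String gesture) {
        return ""
            + "<mmi:mmi xmlns:mmi=\"http://www.w3.org/2008/04/mmi-arch\" version=\"1.0\">\n"
            + "  <mmi:ExtensionNotification mmi:Context=\"" + context + "\"\n"
            + "      mmi:Source=\"kinesthetic-mc\" mmi:Target=\"interaction-manager\"\n"
            + "      mmi:RequestID=\"" + requestId + "\">\n"
            + "    <mmi:Data>\n"
            + "      <emma:emma xmlns:emma=\"http://www.w3.org/2003/04/emma\" version=\"1.0\">\n"
            + "        <emma:interpretation id=\"tilt1\" emma:medium=\"tactile\" emma:mode=\"motion\">\n"
            + "          <gesture>" + gesture + "</gesture>\n"
            + "        </emma:interpretation>\n"
            + "      </emma:emma>\n"
            + "    </mmi:Data>\n"
            + "  </mmi:ExtensionNotification>\n"
            + "</mmi:mmi>\n";
    }

    public static void main(String[] args) {
        // Example: announce a recognized "tilt up" gesture to the interaction manager.
        System.out.println(tiltNotification("ctx-1", "req-42", "tilt_up"));
    }
}
```

Whether this detection and serialization runs on the handset itself or on a server corresponds to the local and remote processing cases discussed in the paper.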
Similar Resources
Spatial Audio with the W3C Architecture for Multimodal Interfaces
The development of multimodal applications is still hampered by the necessity to integrate various technologies and frameworks into a coherent application. In 2012, the W3C proposed a multimodal architecture, standardizing the overall structure and events passed between the constituting components in a multimodal application. In this paper, we present our experiences with implementing a multimo...
MAUI: a Multimodal Affective User Interface Sensing User’s Emotions based on Appraisal Theory - Questions about Facial Expressions..
We are developing a Multimodal Affective User Interface (MAUI) framework shown in Figure 1 and described in [5], aimed at recognizing its users' emotions by sensing their various user-centered modalities (or modes), and at giving the users context-aware feedback via an intelligent affective agent by using different agent-centered modes. The agent is built on an adaptive system architecture which...
A Prolog Datamodel for State Chart XML
SCXML was proposed as one description language for dialog control in the W3C Multimodal Architecture but lacks the facilities required for grounding and reasoning. This prohibits the application of many dialog modeling techniques for multimodal applications following this W3C standard. By extending SCXML with a Prolog datamodel and scripting language, we enable those techniques to be employed a...
Online Multimodal Interaction for Speech Interpretation
In this paper, we describe an implementation of multimodal interaction for speech interpretation to enable access to the Web. As per the W3C recommendation of 10 February 2009, the latest version of EMMA is used for translation of speech signals into a format interpreted by the application language, greatly simplifying the process of adding multiple modes to an application. EMMA is used for annotat...
Multimodal Interfaces - A Generic Design Approach
Integrating new input-output modalities, such as speech, gaze, gestures, haptics, etc., in user interfaces is currently considered as a significant potential contribution to implementing the concept of Universal Access (UA) in the Information Society; see (Oviatt, 2003), for instance. UA in this context means providing everybody, including handicapped users, with easy human-computer interaction...